
    A Computationally Efficient Projection-Based Approach for Spatial Generalized Linear Mixed Models

    Inference for spatial generalized linear mixed models (SGLMMs) for high-dimensional non-Gaussian spatial data is computationally intensive. The computational challenge arises from the high-dimensional random effects and from the slow mixing of Markov chain Monte Carlo (MCMC) algorithms for these models. Moreover, spatial confounding inflates the variance of fixed-effect (regression coefficient) estimates. Our approach addresses both the computational and confounding issues by replacing the high-dimensional spatial random effects with a reduced-dimensional representation based on random projections. Standard MCMC algorithms mix well in this setting, and the reduced dimension speeds up each iteration. We show, via simulated examples, that the reduced-dimensional approach performs well for both Bayesian inference and prediction, and that it compares favorably to existing "reduced-rank" approaches. We also apply our methods to two real-world data examples, one on bird count data and the other on classifying rock types.
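
    A minimal sketch of the core idea, under illustrative assumptions (the exact projection construction, covariance model, and dimensions are not taken from the paper): the n-dimensional spatial random effect is swapped for an m-dimensional coefficient vector through a randomly projected, orthonormalized basis, so a sampler only ever updates m parameters.

```python
# A rough sketch (not the authors' code) of replacing an n-dimensional spatial
# random effect with an m-dimensional random-projection representation.
import numpy as np

rng = np.random.default_rng(0)

n, m = 2000, 50                        # n locations, m << n projected dimensions
coords = rng.uniform(0.0, 1.0, size=(n, 2))

# Illustrative exponential covariance for the latent spatial field.
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
Sigma = np.exp(-dists / 0.2)

# Random projection: a Gaussian test matrix passed once through the covariance,
# then orthonormalized, so the basis retains the dominant spatial structure.
Omega = rng.standard_normal((n, m))
Q, _ = np.linalg.qr(Sigma @ Omega)     # n x m reduced basis

# Reduced-rank representation: w ~= Q @ delta with delta of length m, so an
# MCMC sampler only needs to update the m-dimensional delta each iteration.
delta = rng.standard_normal(m)
w_approx = Q @ delta
print(w_approx.shape)                  # (2000,)
```

    In a full SGLMM the reduced coefficients would be sampled jointly with the regression coefficients and covariance parameters; the sketch only illustrates the dimension-reduction step.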

    Unsupervised Semantic Representation Learning of Scientific Literature Based on Graph Attention Mechanism and Maximum Mutual Information

    Since most scientific literature data are unlabeled, unsupervised graph-based semantic representation learning is crucial. Therefore, an unsupervised semantic representation learning method for scientific literature based on a graph attention mechanism and maximum mutual information (GAMMI) is proposed. The graph attention mechanism aggregates neighboring node features by a weighted summation whose weights depend entirely on the node features themselves, so a different weight can be applied to each node in the graph according to the features of its neighbors. The correlations between vertex features can therefore be better integrated into the model. In addition, an unsupervised graph contrastive learning strategy is proposed to handle unlabeled data and scale to large graphs. By contrasting the mutual information between positive and negative local node representations in the latent space and the global graph representation, the graph neural network can capture both local and global information. Experimental results demonstrate competitive performance on various node classification benchmarks, sometimes even surpassing supervised learning.
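
    As a hedged illustration of the two components described above (not the GAMMI implementation), the sketch below combines a single attention-weighted aggregation layer with a Deep-Graph-Infomax-style contrast between local node representations and a global graph summary. The toy graph, dimensions, and corruption-by-shuffling choice are assumptions.

```python
# Sketch of attention-weighted aggregation plus a local/global mutual-information
# contrast; an illustration of the general technique, not the paper's code.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_nodes, in_dim, out_dim = 6, 8, 4
X = torch.randn(n_nodes, in_dim)                     # toy node features
A = (torch.rand(n_nodes, n_nodes) < 0.4).float()     # toy adjacency matrix
A.fill_diagonal_(1.0)

W = torch.nn.Linear(in_dim, out_dim, bias=False)
a_src = torch.nn.Linear(out_dim, 1, bias=False)      # attention parameters
a_dst = torch.nn.Linear(out_dim, 1, bias=False)

def attention_layer(x):
    """Weights of adjacent nodes depend only on the node features themselves."""
    h = W(x)
    scores = a_src(h) + a_dst(h).T                   # e_ij = a([h_i || h_j]), decomposed
    scores = F.leaky_relu(scores).masked_fill(A == 0, float("-inf"))
    alpha = torch.softmax(scores, dim=1)             # attention over neighbours
    return alpha @ h                                 # weighted summation of features

# Positive view: real graph; negative view: row-shuffled (corrupted) features.
H_pos = attention_layer(X)
H_neg = attention_layer(X[torch.randperm(n_nodes)])
s = torch.sigmoid(H_pos.mean(dim=0))                 # global graph summary

# Maximise MI by discriminating positive vs. negative local/global pairs.
logits_pos = H_pos @ s
logits_neg = H_neg @ s
loss = F.binary_cross_entropy_with_logits(
    torch.cat([logits_pos, logits_neg]),
    torch.cat([torch.ones(n_nodes), torch.zeros(n_nodes)]),
)
print(float(loss))
```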

    Efficient Partitioning Method of Large-Scale Public Safety Spatio-Temporal Data based on Information Loss Constraints

    The storage, management, and application of massive spatio-temporal data arise in many practical scenarios, including public safety. However, due to the unique spatio-temporal distribution characteristics of real-world data, most existing methods have limitations regarding the spatio-temporal proximity of data and load balancing in distributed storage. Therefore, this paper proposes an efficient partitioning method for large-scale public safety spatio-temporal data based on information loss constraints (IFL-LSTP). The IFL-LSTP model targets large-scale spatio-temporal point data by combining a spatio-temporal partitioning module (STPM) with a graph partitioning module (GPM). This approach significantly reduces the scale of the data while maintaining the model's accuracy, which improves partitioning efficiency. It also ensures load balancing in distributed storage while preserving the spatio-temporal proximity of the data partitioning results. This method provides a new solution for the distributed storage of massive spatio-temporal data. Experimental results on multiple real-world datasets demonstrate the effectiveness and superiority of IFL-LSTP.
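
    The abstract does not spell out the partitioning algorithm; the sketch below is only an illustrative stand-in for the two goals it names, preserving spatio-temporal proximity by grouping points into coarse cells and balancing load by greedily assigning cells to partitions. The cell size, number of partitions, and greedy rule are assumptions, not part of IFL-LSTP.

```python
# Illustrative spatio-temporal cell bucketing followed by greedy load balancing;
# not the IFL-LSTP algorithm itself.
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)
pts = rng.uniform(0, 100, size=(10_000, 3))          # (x, y, t) point records

# Step 1: coarse spatio-temporal cells (proximity-preserving aggregation).
cell_ids = [tuple(c) for c in (pts // 10).astype(int)]
cells = Counter(cell_ids)                            # cell -> point count

# Step 2: greedy balanced assignment of cells to k storage partitions
# (largest cells first, each to the currently lightest partition).
k = 8
loads = [0] * k
assignment = {}
for cell, count in cells.most_common():
    p = int(np.argmin(loads))
    assignment[cell] = p
    loads[p] += count

print("partition loads:", loads)
```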

    Assessing the Impact of Retreat Mechanisms in a Simple Antarctic Ice Sheet Model Using Bayesian Calibration

    The response of the Antarctic ice sheet (AIS) to changing climate forcings is an important driver of sea-level changes. Anthropogenic climate change may drive a sizeable AIS tipping-point response with subsequent increases in coastal flooding risks. Many studies analyzing flood risks use simple models to project the future responses of the AIS and its sea-level contributions. These analyses have provided important new insights, but they are often silent on the effects of potentially important processes such as Marine Ice Sheet Instability (MISI) or Marine Ice Cliff Instability (MICI). These approximations can be well justified and result in more parsimonious and transparent model structures. This raises the question of how the approximation impacts hindcasts and projections. Here, we calibrate a previously published and relatively simple AIS model, which neglects the effects of MICI and regional characteristics, using a combination of observational constraints and a Bayesian inversion method. Specifically, we approximate the effects of missing MICI by comparing our results to those from expert assessments with more realistic models, and we quantify the bias during the last interglacial, when MICI may have been triggered. Our results suggest that the model can approximate the process of MISI and reproduce the projected median melt from some previous expert assessments in the year 2100. Yet, our mean hindcast is roughly 3/4 of the observed data during the last interglacial period, and our mean projection is roughly 1/6 and 1/10 of the mean from a model accounting for MICI in the year 2100. These results suggest that neglecting MICI and/or regional characteristics can lead to a low bias in AIS melting during warming periods and hence a potential low bias in projected sea levels and flood risks.
    Comment: v1: 16 pages, 4 figures, 7 supplementary files; v2: 15 pages, 4 figures, 7 supplementary files, corrected typos, revised title, updated according to revisions made through the publication process.
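
    As a rough illustration of the Bayesian-inversion step only (not the published AIS model or its calibration data), the toy sketch below calibrates a hypothetical one-parameter melt model against a single synthetic observational constraint with a random-walk Metropolis sampler. The forward model, constraint value, and prior are illustrative assumptions.

```python
# Toy Bayesian calibration of a hypothetical melt-sensitivity parameter against
# one synthetic observational constraint; not the paper's model or data.
import numpy as np

rng = np.random.default_rng(42)

def forward_model(sensitivity, forcing=1.5):
    """Hypothetical stand-in for a simple AIS model: melt ~ sensitivity * forcing."""
    return sensitivity * forcing

obs, obs_sigma = 3.0, 0.5            # illustrative observational constraint

def log_posterior(theta):
    if theta <= 0:                   # flat prior on positive sensitivities
        return -np.inf
    resid = forward_model(theta) - obs
    return -0.5 * (resid / obs_sigma) ** 2

# Random-walk Metropolis sampler.
theta, logp = 1.0, log_posterior(1.0)
samples = []
for _ in range(20_000):
    prop = theta + 0.2 * rng.standard_normal()
    logp_prop = log_posterior(prop)
    if np.log(rng.uniform()) < logp_prop - logp:
        theta, logp = prop, logp_prop
    samples.append(theta)

post = np.array(samples[5_000:])     # discard burn-in
print("posterior mean sensitivity:", post.mean())
```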

    Kryging: Geostatistical analysis of large-scale datasets using Krylov subspace methods

    Analyzing massive spatial datasets using a Gaussian process model poses computational challenges. This problem is prevalent in applications such as environmental modeling, ecology, forestry, and environmental health. We present a novel approximate inference methodology that uses profile likelihood and Krylov subspace methods to estimate the spatial covariance parameters and makes spatial predictions with uncertainty quantification for point-referenced spatial data. The proposed method, Kryging, applies to both observations on a regular grid and irregularly spaced observations, and to any Gaussian process with a stationary isotropic (and certain geometrically anisotropic) covariance function, including the popular Matérn covariance family. We make use of the block Toeplitz structure with Toeplitz blocks of the covariance matrix and use fast Fourier transform methods to bypass the computational and memory bottlenecks of approximating the log-determinant and matrix-vector products. We perform extensive simulation studies to show the effectiveness of our model by varying sample sizes, spatial parameter values, and sampling designs. A real-data application is also performed on a dataset consisting of land surface temperature readings taken by the MODIS satellite. Compared to existing methods, the proposed method performs satisfactorily with much less computation time and better scalability.
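
    A minimal sketch of the kind of FFT-accelerated Krylov computation the abstract describes, under simplifying assumptions (a 1-D regular grid, an exponential covariance, and SciPy's conjugate-gradient solver rather than the authors' code): the Toeplitz covariance is embedded in a circulant matrix so matrix-vector products cost O(n log n), and the covariance matrix itself is never formed.

```python
# Circulant embedding of a Toeplitz covariance for fast matvecs inside a Krylov
# (conjugate-gradient) solve; an illustration of the technique, not Kryging itself.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n = 4096
grid = np.arange(n) * 0.01
cov_col = np.exp(-np.abs(grid) / 0.3)          # first column of a Toeplitz covariance
cov_col[0] += 1e-2                             # nugget for numerical stability

# Circulant embedding of the symmetric Toeplitz matrix (size 2n - 2).
circ_col = np.concatenate([cov_col, cov_col[-2:0:-1]])
circ_eig = np.fft.rfft(circ_col)

def toeplitz_matvec(x):
    """O(n log n) product of the n x n Toeplitz covariance with a vector."""
    x = np.asarray(x).ravel()
    xp = np.concatenate([x, np.zeros(len(circ_col) - n)])
    y = np.fft.irfft(circ_eig * np.fft.rfft(xp), len(circ_col))
    return y[:n]

A = LinearOperator((n, n), matvec=toeplitz_matvec)
rhs = np.random.default_rng(0).standard_normal(n)
sol, info = cg(A, rhs)                         # Krylov solve without forming the matrix
print(info, np.linalg.norm(toeplitz_matvec(sol) - rhs))
```

    On a 2-D regular grid the same idea applies to the block-Toeplitz-with-Toeplitz-blocks covariance via a two-dimensional FFT.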
